A New Method for Extracting Key Terms from Micro-Blogs Messages Using Wikipedia

Author

  • Ahmad Ali Al-Zubi
Abstract

This study describes a method for extracting key terms from micro-blog messages using information obtained by analysing the structure and content of the online encyclopaedia Wikipedia. The algorithm is based on calculating the "keyphraseness" of each term, i.e., an estimate of the probability that the term would be chosen as a key term in the text. In evaluation, the algorithm showed satisfactory results on this task, significantly outperforming other existing algorithms. To demonstrate a possible application, the algorithm was implemented in a prototype contextual advertising system. Several ways of using information obtained by analysing Twitter messages in various support services are also formulated.
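
The abstract does not give the formula for keyphraseness, so the following is only a minimal sketch under a stated assumption: keyphraseness is estimated as the fraction of Wikipedia articles containing a term in which that term appears as a link anchor, a definition commonly used in Wikipedia-based term extraction. The counts in `TOY_STATS`, the threshold, the whitespace tokenizer, and all function names are hypothetical and chosen for illustration; they are not the author's implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical counts: term -> (articles where the term is a link anchor,
#                               articles where the term appears at all).
# These numbers are invented for illustration, not taken from Wikipedia.
TOY_STATS: Dict[str, Tuple[int, int]] = {
    "machine learning": (800, 1000),
    "twitter": (450, 900),
    "new": (10, 10000),
    "talk": (20, 4000),
}


def keyphraseness(linked: int, total: int) -> float:
    """Assumed estimate of P(term is a key term): linked / total."""
    return linked / total if total > 0 else 0.0


def candidate_ngrams(tokens: List[str], max_n: int = 3) -> List[str]:
    """Return every word n-gram of the message with 1..max_n words."""
    return [
        " ".join(tokens[i:i + n])
        for i in range(len(tokens))
        for n in range(1, max_n + 1)
        if i + n <= len(tokens)
    ]


def extract_key_terms(
    message: str,
    stats: Dict[str, Tuple[int, int]],
    threshold: float = 0.3,
) -> List[Tuple[str, float]]:
    """Score known n-grams and keep those at or above the threshold."""
    # Naive tokenizer: lowercases and splits on whitespace only.
    tokens = message.lower().split()
    scored: Dict[str, float] = {}
    for term in candidate_ngrams(tokens):
        if term in stats:
            score = keyphraseness(*stats[term])
            if score >= threshold:
                scored[term] = score
    # Highest keyphraseness first.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    print(extract_key_terms("New talk on machine learning at Twitter", TOY_STATS))
```

With the toy counts above, the message "New talk on machine learning at Twitter" yields "machine learning" followed by "twitter", while "new" and "talk" fall below the threshold because they are rarely linked relative to how often they occur. A real system would need punctuation-aware tokenization suited to micro-blog text and counts derived from an actual Wikipedia dump.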

Similar resources

A Hybrid Method for Extracting Key Terms of Text Documents

Key terms are important terms in a document that give the reader a high-level description of its contents. Extracting key terms is a basic step for many problems in natural language processing, such as document classification, document clustering, text summarization, and identifying the general subject of a document. This article proposed a new method for extracting key terms from text docum...

Wikipedia Article Content Based Query Expansion in IR4QA System

This paper describes the work of our WUST group in NTCIR-8 on the subtask of English to Simplified Chinese and Simplified Chinese to Simplified Chinese information retrieval for question answering (EN-CS and CS-CS IR4QA). In order to enhance the precision and efficiency in question analysis, we employ a special question analysis method extracting more appropriate key terms and apply the query e...

Harvesting Domain-Specific Terms using Wikipedia

We present a simple but effective method of automatically extracting domain-specific terms using Wikipedia as training data (i.e. self-supervised learning). Our first goal is to show, using human judgments, that Wikipedia categories are domain-specific and thus can replace manually annotated terms. Second, we show that identifying such terms using harvested Wikipedia categories and entities as s...

Extracting Trust from Domain Analysis: A Case Study on the Wikipedia Project

The problem of identifying trustworthy information on the World Wide Web is becoming increasingly acute as new tools such as wikis and blogs simplify and democratize publications. Wikipedia is the most extraordinary example of this phenomenon and, although a few mechanisms have been put in place to improve contributions quality, trust in Wikipedia content quality has been seriously questioned. ...

Extracting location and creator-related information from Wikipedia-based information-rich taxonomy for ConceptNet expansion

Our research goal is to generate new assertions suitable for introduction to the Japanese part of the ConceptNet common sense knowledge ontology. In this paper we present a method for extracting IsA assertions (hyponymy relations), AtLocation assertions (informing of the location of an object or place), LocatedNear assertions (informing of neighboring locations) and CreatedBy assertions (inform...

Publication date: 2013